Abstract 

We discuss some of the challenges facing shared autonomy. In particular, 
using the methods of interacting Gaussian process (IGP) models 
[Trautman, 2013], we explore 

1. shared autonomy over unreliable networks, 

2. how we can model individual human operators (in contrast to the 
“average” of a human operator), and 

3. how IGP naturally models and integrates sliding autonomy into the 
joint human-machine system. 

We include a Background Section (Section 4) for completeness. 


1 Prelude 

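We begin by recalling the assistive teleoperation model of Section 4, 

$$p(\mathbf{h}, \mathbf{f}^{(R)} \mid z_{1:t}) = \psi(\mathbf{h}, \mathbf{f}^{(R)})\, p(\mathbf{h} \mid z^{(h)}_{1:t})\, p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t}) \tag{1.1}$$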

and we list the meaning of each quantity: 

• $\mathbf{f}^{(R)}$ is the trajectory of the robot through some state space. For ground 
robots, a common choice for the state space is $[x(t), y(t), \theta(t)]$; for air 
vehicles the state could be 

$$[x(t), y(t), z(t), \mathrm{roll}(t), \mathrm{pitch}(t), \theta(t)].$$

This trajectory is a priori modeled as a random function distributed according 
to a Gaussian process [Rasmussen and Williams, 2006], 

$$\mathbf{f}^{(R)} \sim GP(\mathbf{f}^{(R)}; 0, k), \tag{1.2}$$

which can be trained offline using input-output examples of the robot’s 
kinematics. 

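Online measurements $z^{(R)}_t$ of the state of the robot update the GP to 

$$p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t}) = GP(\mathbf{f}^{(R)}; m^{(R)}_t, k^{(R)}_t) \tag{1.3}$$

(assuming that the data $z^{(R)}_{1:t-1}$ has already arrived), where $m^{(R)}_t$ is the new 
mean and $k^{(R)}_t$ is the new covariance function of the GP (by “new”, we 
mean after incorporation of the new data $z^{(R)}_t$). 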

Importantly, this model allows nonparametric probabilistic prediction of 
the trajectory into the future; that is, 

$$\mathbf{f}^{(R)} : [1:T] \to (x, y, \theta),$$

where $1 \le t \le T$. Indeed, $T$ can be as large as one likes (corresponding 
to how far into the future one predicts); because $\mathbf{f}^{(R)}$ is distributed as a GP, 
a continuous measure of the uncertainty at each time exists, although it grows 
quite large the further one predicts into the future. 

We remark on the following: the structure of $p(\mathbf{h}, \mathbf{f}^{(R)} \mid z_{1:t})$ is such 
that the individual robot model $p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t})$ changes with each new data 
point $z^{(R)}_t$; in particular, since GPs are nonparametric models, they can 
capture some amount of nonlinear behavior that arises online (such as motor 
failures or terrain changes). The extent to which this is true needs to 
be explored, however. 

• $\mathbf{h}$ is the trajectory of the human operator through some state space. While 
the state space of the robot can typically be well characterized with physical 
models, the state space of the human¹ is not immediately clear. 

As a first step, however, we choose the set of operator commands to the 
robot to correspond to the state of the human; accordingly, we treat the 
operator input as measurements of the human trajectory through this 
input space. In essence, we are regarding the human state as manifesting 
via the commands to the robot; extrapolating, we assume that one can 
predict where the human will go in the “command space” (appropriately 
hedged using probability densities). 

As a simple example, if the human is operating a joystick that sends 
velocity commands $(v_x, v_y)$ to the robot’s actuators, then 

$$z^{(h)}_t = (v_x, v_y).$$

Furthermore, just as for the robot, the human trajectory is a priori modeled 
as a random function distributed according to a Gaussian process 
[Rasmussen and Williams, 2006], 

$$\mathbf{h} \sim GP(\mathbf{h}; 0, k). \tag{1.4}$$

New measurements $z^{(h)}_t$ of the state of the human update the GP to 

$$p(\mathbf{h} \mid z^{(h)}_{1:t}) = GP(\mathbf{h}; m^{(h)}_t, k^{(h)}_t) \tag{1.5}$$

(assuming that the data $z^{(h)}_{1:t-1}$ has already arrived), where $m^{(h)}_t$ is the new 
mean and $k^{(h)}_t$ is the new covariance function of the GP (by “new”, we 
mean after incorporation of the new data $z^{(h)}_t$). 

¹ See Section ?? for a more nuanced discussion of learning the human state space. 

In this way, we can predict what we expect the operator to do using 
probabilistic inference (just as was done with the robot via $p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t})$), 
thereby enabling joint models of the human-robot team that anticipate 
future situations (at times $T > t$) by making decisions now, at time $t$. 

• $\psi(\mathbf{h}, \mathbf{f}^{(R)})$ is the interaction function between the human and the robot. 
Section 4 discusses a particular choice of this function. 
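The GP update and prediction described above for both $\mathbf{f}^{(R)}$ and $\mathbf{h}$ can be illustrated with a minimal sketch. This is not the implementation used in this work: it assumes a one-dimensional state, a squared-exponential kernel, and hypothetical values for the length scale, the noise variance, and the measurements, none of which come from the text.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance k(t, t') between two sets of time points.
    The kernel family and its hyperparameters are assumptions of this sketch."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(t_obs, z_obs, t_query, noise_var=0.01):
    """Posterior mean and variance of a zero-mean GP at t_query, given noisy
    measurements z_obs taken at times t_obs."""
    K = rbf_kernel(t_obs, t_obs) + noise_var * np.eye(len(t_obs))
    K_s = rbf_kernel(t_obs, t_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, z_obs))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(t_query, t_query)) - np.sum(v ** 2, axis=0)
    return mean, var

t_obs = np.arange(1.0, 6.0)                  # measurements received at t = 1..5
z_obs = 0.5 * t_obs                          # hypothetical x-coordinate readings
t_query = np.array([5.0, 6.0, 8.0, 12.0])    # now, and three look-ahead times
mean, var = gp_posterior(t_obs, z_obs, t_query)
for t, m, s2 in zip(t_query, mean, var):
    print(f"t={t:4.1f}  mean={m:6.3f}  var={s2:6.4f}")
```

In this sketch the predictive variance is small at the most recent measurement, grows with the look-ahead, and saturates at the prior variance, while the predictive mean reverts to the zero prior mean far from the data.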

With this model, control of the remote vehicle is accomplished in a receding 
horizon fashion: upon receipt of a measurement $z_t = (z^{(h)}_t, z^{(R)}_t)$, the model 
$p(\mathbf{h}, \mathbf{f}^{(R)} \mid z_{1:t})$ is updated, and the new navigation protocol is taken to be 

$$(\mathbf{h}, \mathbf{f}^{(R)})^* = \arg\max_{\mathbf{h}, \mathbf{f}^{(R)}} \left[ p(\mathbf{h}, \mathbf{f}^{(R)} \mid z^{(h)}_{1:t}, z^{(R)}_{1:t}) \right]. \tag{1.6}$$

We then take $\mathbf{f}^{(R)*}(t+1)$ as the next action in the path (where $t+1$ means 
the next step of the optimal robot trajectory through the joint human-robot 
space). At $t+1$, we receive observations $(z^{(h)}_{t+1}, z^{(R)}_{t+1})$, update the distribution to 
$p(\mathbf{h}, \mathbf{f}^{(R)} \mid z_{1:t+1})$, find the MAP, and choose $\mathbf{f}^{(R)*}(t+2)$ as the next step. 

This process repeats until the human-robot team arrives at the destination. 
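The receding-horizon procedure above can be sketched in a few lines under strong simplifying assumptions that are ours, not the paper's: a one-dimensional state, independent Gaussian predictive distributions at each look-ahead step (standing in for the GP posteriors), and a Gaussian interaction function $\psi = \exp(-\|\mathbf{h} - \mathbf{f}^{(R)}\|^2 / (2\gamma))$. Under these assumptions the joint MAP is the solution of a linear system. The helper `predicted_mean` and all numerical values are hypothetical.

```python
import numpy as np

def joint_map(m_h, var_h, m_f, var_f, gamma):
    """Joint MAP (h*, f*) for independent Gaussian factors at each look-ahead
    step and psi(h, f) = exp(-||h - f||^2 / (2 * gamma)). Setting the gradient
    of the negative log posterior to zero gives a linear system."""
    H = len(m_h)
    eye = np.eye(H)
    A = np.block([
        [np.diag(1.0 / var_h) + eye / gamma, -eye / gamma],
        [-eye / gamma, np.diag(1.0 / var_f) + eye / gamma],
    ])
    b = np.concatenate([m_h / var_h, m_f / var_f])
    sol = np.linalg.solve(A, b)
    return sol[:H], sol[H:]

def predicted_mean(x, goal, speed, horizon):
    """Hypothetical predictive mean: head toward the goal at constant speed."""
    steps = speed * np.arange(1, horizon + 1)
    return x + np.sign(goal - x) * np.minimum(steps, abs(goal - x))

goal, x, horizon = 10.0, 0.0, 5
var = 0.1 * np.arange(1, horizon + 1)    # uncertainty grows with look-ahead
path = [x]
for _ in range(100):                     # guard against non-termination
    if abs(goal - x) < 1e-6:
        break
    # "Update" step: in the full model these would be the GP posterior means.
    m_h = predicted_mean(x, goal, 1.0, horizon)   # operator pushes faster
    m_f = predicted_mean(x, goal, 0.5, horizon)   # autonomy prefers slower
    h_star, f_star = joint_map(m_h, var, m_f, var, gamma=0.05)
    x = float(f_star[0])                 # execute only the first planned step
    path.append(x)
print(len(path) - 1, "steps, final position", round(x, 6))
```

Only the first element of the optimal robot trajectory is executed at each iteration; the remainder of the plan is discarded and recomputed once the next measurements arrive.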


2 Shared Autonomy over Unreliable Networks 

We focus here on a few particular aspects of the control of a remote vehicle 
over unreliable networks: because operator commands can often arrive late, 
be dropped, or be otherwise corrupted across an arbitrary network, velocity 
commands such as $(v_x, v_y)$ cannot be literally interpreted. 

2.1 Laggy networks 

Imagine that the operator is viewing an onboard video feed that is 1 second old, 
due to communication constraints. Additionally, imagine that the command 
$(v_x, v_y)$ takes 1 second to return to the remote vehicle. Thus, the command 
received onboard the vehicle is 2 seconds old. Clearly, this information is stale, 
and if interpreted by the vehicle literally, could destabilize control. 

However, these commands, while stale, are not devoid of information; we 
suggest instead that the inputs be treated as measurements $z^{(h)}_{t-2}$ (if $t$ is in 
seconds) of the human-machine system. This suggests that a likelihood be 
placed on the commands, 

$$p(z^{(h)}_{t-2} \mid \mathbf{h}). \tag{2.1}$$

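If the current time on the remote vehicle is $t$, then the distribution over the 
human trajectory gets updated to (assuming all measurements prior to $t-2$ 
have been received in a timely fashion) 

$$p(\mathbf{h} \mid z^{(h)}_{1:t-2}) = GP(\mathbf{h}; m^{(h)}_{t-2}, k^{(h)}_{t-2}).$$

We now update the navigation distribution to 

$$p(\mathbf{h}, \mathbf{f}^{(R)} \mid z^{(h)}_{1:t-2}, z^{(R)}_{1:t}) = \psi(\mathbf{h}, \mathbf{f}^{(R)})\, p(\mathbf{h} \mid z^{(h)}_{1:t-2})\, p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t}).$$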

Because we are modeling trajectories of the human, the model can 
naturally incorporate delayed receipt: the data at $t-2$ informs the distribution 
$p(\mathbf{h} \mid z^{(h)}_{1:t-2})$, but when inference is done at time $t$, additional uncertainty has 
accumulated. Thus, when we extract the navigation protocol using 

$$(\mathbf{h}, \mathbf{f}^{(R)})^* = \arg\max_{\mathbf{h}, \mathbf{f}^{(R)}} \psi(\mathbf{h}, \mathbf{f}^{(R)})\, p(\mathbf{h} \mid z^{(h)}_{1:t-2})\, p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t}), \tag{2.2}$$

the distribution around the human trajectory at time $t$ is less peaked, and so 
is treated as less informative when evaluating $(\mathbf{h}, \mathbf{f}^{(R)})^*$ at time $t$ (which is the actual 
movement the robot executes at time $t$). 
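The effect of lag on the human model can be made concrete with a small sketch. It assumes a unit-variance GP with a squared-exponential kernel over the command signal and hypothetical values for the length scale and noise variance; the posterior variance of a GP does not depend on the measured values, so only the time stamps of the received commands are needed.

```python
import numpy as np

def predictive_variance(t_obs, t_query, length_scale=2.0, noise_var=0.01):
    """Posterior variance at t_query of a unit-variance GP observed (with
    noise) at times t_obs. Kernel and hyperparameters are assumptions."""
    diff = t_obs[:, None] - t_obs[None, :]
    K = np.exp(-0.5 * (diff / length_scale) ** 2) + noise_var * np.eye(len(t_obs))
    k_s = np.exp(-0.5 * ((t_obs - t_query) / length_scale) ** 2)
    return float(1.0 - k_s @ np.linalg.solve(K, k_s))

t_now = 10.0
stamps = np.arange(1.0, t_now + 1.0)                  # command time stamps 1..10 s
var_fresh = predictive_variance(stamps, t_now)        # every command has arrived
var_lag1 = predictive_variance(stamps[:-1], t_now)    # newest command is from t - 1
var_lag2 = predictive_variance(stamps[:-2], t_now)    # newest command is from t - 2
print(var_fresh, var_lag1, var_lag2)
```

The variance at the current time grows with the lag but stays below the prior variance, which reflects the claim that stale commands are less informative yet not devoid of information. Removing arbitrary entries from `stamps` models dropped commands in the same way.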


2.2 Lossy networks 

Lossy networks are treated in an identical manner: indeed, if measurement $z^{(h)}_k$ 
is missing (where $1 \le k \le t$), then our navigation protocol is still 

$$(\mathbf{h}, \mathbf{f}^{(R)})^* = \arg\max_{\mathbf{h}, \mathbf{f}^{(R)}} \psi(\mathbf{h}, \mathbf{f}^{(R)})\, p(\mathbf{h} \mid z^{(h)}_{1:t})\, p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t}),$$

where the conditioning set $z^{(h)}_{1:t}$ simply omits the missing measurement. 

Again, the effect on the performance of the system is gradual: the more 
measurements go missing (or are delayed), the less informative $p(\mathbf{h} \mid z^{(h)}_{1:t})$ is, and 
the more the onboard autonomy is trusted (we thus have a natural formulation 
of sliding autonomy: see Section 3). 

We emphasize that how well this approach performs is tied strongly to the 
fidelity of the likelihood function $p(z^{(h)}_t \mid \mathbf{h})$ (as is the case with any Bayesian 
approach); an overconfident measurement model can lead to overly confident 
human trajectory models, which can place too much weight on incorrect human 
input. Underconfident models will tend to overtrust the onboard autonomy, 
thus potentially leading to a robot that does not follow the orders of the 
operator. Nevertheless, the presence of an uncertain network forces us to treat 
operator inputs as probabilistic quantities, rather than deterministic ones. 
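The sensitivity to the likelihood can be seen in the simplest possible setting: a scalar Gaussian belief over the next command, conditioned on one (possibly corrupted) measurement. This is only a sketch of the general point; the Gaussian likelihood and all numbers below are assumptions for illustration.

```python
def gaussian_update(prior_mean, prior_var, z, noise_var):
    """Condition a scalar Gaussian belief on one noisy measurement z, where
    noise_var is the variance assumed by the likelihood p(z | h)."""
    gain = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + gain * (z - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var, gain

prior_mean, prior_var = 1.0, 0.25   # hypothetical predicted command v_x
z_corrupted = 3.0                   # command corrupted by the network
results = {
    label: gaussian_update(prior_mean, prior_var, z_corrupted, noise_var)
    for label, noise_var in [("overconfident", 1e-4),
                             ("moderate", 0.25),
                             ("underconfident", 25.0)]
}
for label, (mean, var, gain) in results.items():
    print(f"{label:15s} gain={gain:.4f}  mean={mean:.4f}  var={var:.5f}")
```

With a very small assumed noise variance the belief jumps almost entirely to the corrupted command and becomes sharply peaked; with a very large one the command is nearly ignored, so the blended action is dominated by the onboard autonomy.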


3 Generalizing Sliding Autonomy 

While methods of sliding autonomy have been explored for a wide variety of 
tasks (see [Dias et al., 2008]), the amount of autonomy allocated to the robot 
(or robots) or the operator (or operators) is typically implemented in a manner 
independent of the human-robot team; that is, some independent estimation 
algorithm determines how much control each entity receives, and then that 
number is fed into an algorithm that mixes a weighted combination of each 
intelligence. 

We argue here that our approach integrates the allocation of autonomy and 
the actual mixing of the multiple intelligences in a single step. In particular, we 
revisit our model of assistive teleoperation, 

$$p(\mathbf{h}, \mathbf{f}^{(R)} \mid z_{1:t}) = \psi(\mathbf{h}, \mathbf{f}^{(R)})\, p(\mathbf{h} \mid z^{(h)}_{1:t})\, p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t}).$$

This model contains an online model of the human operator, $p(\mathbf{h} \mid z^{(h)}_{1:t})$, and of the 
robot, $p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t})$. In Section 2 we discussed how the human operator 
model can respond online to varying network conditions; in Section ?? we 
discussed how both the human and robot models can learn, in an online fashion, 
peculiarities of the individual operator or individual robot (peculiarities 
of the individual with respect to general psychological theories or CAD-based 
kinematic models, respectively). 

More generally, these individual models maintain a measure of uncertainty 
about the current state of operator or robot—this uncertainty can naturally 
be interpreted as proportional to the inverse of how much autonomy should be 
allocated to each entity. The model thus contains an implicit measure of sliding 
autonomy, which is a natural artifact of the IGP model. 

Perhaps more importantly, this measure of sliding autonomy 
is incorporated into the final action in a probabilistic fashion: should the 
uncertainty become large around either intelligence (due to an unreliable network, 
uncharacteristic behavior, or any number of other anomalies), then the amount 
of confidence placed in that intelligence is reduced upon blending in the 
function $\psi(\mathbf{h}, \mathbf{f}^{(R)})$. Mathematically, as $p(\mathbf{h} \mid z^{(h)}_{1:t})$ (or $p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t})$) becomes 
more diffuse, its effect on $\psi$ becomes less pronounced. 

One can only extract so much information from any system, and when both 
distributions become diffuse, the overall information content is very low, and 
so navigation should start to degrade. The best we can hope for is a graceful 
degradation. 

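Succinctly, the final action taken by the remote vehicle is given by 

$$(\mathbf{h}, \mathbf{f}^{(R)})^* = \arg\max_{\mathbf{h}, \mathbf{f}^{(R)}} \left[ p(\mathbf{h}, \mathbf{f}^{(R)} \mid z^{(h)}_{1:t}, z^{(R)}_{1:t}) \right].$$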


Effectively, sliding autonomy (or blended autonomy) is a natural by-product of 
our formulation: an implicit measure of blending exists in the individual 
models, while the uncertainty of those individual models feeds into the interaction 
function. 
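A scalar special case makes the implicit blending weight explicit. Suppose, purely as an assumption for illustration, that at a given time step the human and robot models are Gaussian with means $m_h$, $m_f$ and variances $\sigma_h^2$, $\sigma_f^2$, and that $\psi = \exp(-(h - f)^2 / (2\gamma))$. Maximizing the joint density first over $h$ and then over $f$ gives a robot action that is a convex combination of the two means, with weight $\sigma_f^2 / (\sigma_h^2 + \sigma_f^2 + \gamma)$ on the human. The sketch below evaluates that weight; the function names and numbers are hypothetical.

```python
def human_weight(var_h, var_f, gamma):
    """Weight on the human mean in the robot component of the joint MAP, for
    scalar Gaussian factors and psi(h, f) = exp(-(h - f)^2 / (2 * gamma)).
    Obtained by eliminating h from the stationarity conditions."""
    return var_f / (var_h + var_f + gamma)

def blended_action(m_h, var_h, m_f, var_f, gamma):
    """Robot component f* of the joint MAP under the same assumptions."""
    w = human_weight(var_h, var_f, gamma)
    return w * m_h + (1.0 - w) * m_f

var_f, gamma = 0.1, 0.05
human_variances = (0.01, 0.1, 1.0, 10.0)
weights = [human_weight(v, var_f, gamma) for v in human_variances]
for v, w in zip(human_variances, weights):
    print(f"var_h={v:5.2f}  weight on human={w:.4f}")
```

As the human model becomes more diffuse, the weight on the operator slides smoothly toward zero; as the robot model becomes more diffuse, it slides toward one. No separate arbitration step is needed in this toy setting.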




4 Background 

4.1 Blended Autonomy as an Extension of mgIGP 

Current theories of shared autonomy are dominated by anecdotal evidence and 
heuristic guidelines. In [Hardin and Goodrich, 2009] the three recognized levels 
of autonomy are listed: adaptive (the agent adjudicates), adjustable (the 
supervisor adjudicates), and mixed-initiative (the agent and supervisor “collaborate 
to maintain the best perceived level of autonomy”). In [Fiore et al., 2011], 
human-robot collaboration schemas are organized around social, organizational, 
and cultural factors, and in [Arkin et al., 200?] the role of ethological and 
emotional models in human-robot interaction is examined. Furthermore, actual 
implementations are typically designed around need, rather than principle [Murphy, 2010]: 
either the remote human operator retains complete control of the robot, or the 
human operator makes online decisions about the amount of autonomy the robot 
is given. 

Importantly, the work of [Dragan and Srinivasa, 2012] introduces principled 
user goal inference and prediction methods, combined with an arbitration step to 
balance user input and robot intelligence. However, our approach to shared 
autonomy as an extension of multiple goal interacting Gaussian processes (mgIGP) 
(see [Trautman, 2013]) unifies the three steps of [Dragan and Srinivasa, 2012], 
thus providing a more straightforward framework in which to understand the 
fusion of human and machine intelligence. 

We also propose that extending mgIGP could provide a novel mathematical 
formulation of shared autonomy (which we call blended autonomy). First, 
recall the mgIGP model of [Trautman and Krause, 2010] and [Trautman et al., 2013]. 
Next, suppose a human operator is controlling the robot from a remote location, 
so the robot is no longer fully autonomous (we continue the narrative of a robot 
navigating through a crowd of $n$ individuals $\mathbf{f} = (\mathbf{f}^{(1)}, \ldots, \mathbf{f}^{(n)})$). However, 
rather than treating the human commands as system interrupts, we wish to 
understand the continuum of blended autonomy in a mathematical way. Using 
the navigation protocol derived using $p(\mathbf{f}^{(R)}, \mathbf{f} \mid z_{1:t})$ as motivation, we could 
model the joint human operator-robot system as 

$$p(\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f} \mid z_{1:t}) = \psi(\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f})\, p(\mathbf{h} \mid z^{(h)}_{1:t})\, p(\mathbf{f}^{(R)} \mid z^{(R)}_{1:t}) \prod_{i=1}^{n} p(\mathbf{f}^{(i)} \mid z^{(i)}_{1:t}), \tag{4.1}$$

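where $\mathbf{h}$ is the human operator’s predicted interests, modeled with a 
Gaussian process mixture $p(\mathbf{h} \mid z^{(h)}_{1:t})$. The measurement data is now 
$z_{1:t} = (z^{(h)}_{1:t}, z^{(R)}_{1:t}, z^{(1)}_{1:t}, \ldots, z^{(n)}_{1:t})$, where $z^{(h)}_{1:t}$ are the human operator commands sent from 
time $1$ to $t$. Additionally, $\psi(\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f})$ is the interaction function between the 
human operator, robot, and human crowd. One concrete instantiation of this 
interaction function is 

$$\psi(\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f}) = \psi_h(\mathbf{h}, \mathbf{f}^{(R)})\, \psi_f(\mathbf{f}^{(R)}, \mathbf{f}), \tag{4.2}$$

where $\psi_f(\mathbf{f}^{(R)}, \mathbf{f})$ is the cooperation function from the model $p(\mathbf{f}^{(R)}, \mathbf{f} \mid z_{1:t})$ and 
$\psi_h(\mathbf{h}, \mathbf{f}^{(R)})$ is an “attraction” model between the operator commands and the 
robot path. One possible attraction model is 

$$\psi_h(\mathbf{h}, \mathbf{f}^{(R)}) = \exp\left(-(\mathbf{h} - \mathbf{f}^{(R)})^{T} \Sigma^{-1} (\mathbf{h} - \mathbf{f}^{(R)})\right). \tag{4.3}$$

Thus, the operator’s intentionality $\mathbf{h}$ and the robot’s planned path $\mathbf{f}^{(R)}$ are 
merged: this formulation of $\psi_h(\mathbf{h}, \mathbf{f}^{(R)})$ gives high weight to paths $\mathbf{h}$ and 
$\mathbf{f}^{(R)}$ that are similar, while the probability of dissimilar paths decreases 
exponentially. Bear in mind, however, that $\psi_f(\mathbf{f}^{(R)}, \mathbf{f})$ still gives high weight to paths $\mathbf{f}$ 
and $\mathbf{f}^{(R)}$ that cooperate. All of this is balanced against the (predicted) 
individual intentionality encoded in the Gaussian process mixtures $p(\mathbf{f}^{(i)} \mid z^{(i)}_{1:t})$. 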
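A short numerical sketch of the attraction model may help. It assumes the exponent is negative, so that identical paths receive the maximal weight of one (as the description above requires), and it treats each path as a stacked vector with a hypothetical positive-definite $\Sigma$; the paths and the value of $\Sigma$ are made up for illustration.

```python
import numpy as np

def attraction(h, f_r, sigma):
    """psi_h(h, f_R) = exp(-(h - f_R)^T Sigma^{-1} (h - f_R)) for stacked
    path vectors h and f_R and a positive-definite matrix sigma."""
    d = h - f_r
    return float(np.exp(-d @ np.linalg.solve(sigma, d)))

h = np.array([0.0, 1.0, 2.0, 3.0])     # hypothetical operator path
sigma = 2.0 * np.eye(len(h))           # hypothetical weighting matrix
same = attraction(h, h, sigma)         # identical paths
near = attraction(h, h + 0.1, sigma)   # robot path close to the operator's
far = attraction(h, h + 2.0, sigma)    # robot path far from the operator's
print(same, near, far)
```

A larger $\Sigma$ makes the attraction more tolerant of disagreement between the operator and the robot, while a smaller $\Sigma$ ties the robot path more tightly to the commanded one.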

As with mgIGP, the model $p(\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f} \mid z_{1:t})$ suggests a natural way to 
interpret blended autonomy (or blended decision making): at time $t$, find the 
MAP assignment for the posterior 

$$(\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f})^* = \arg\max_{\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f}} p(\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f} \mid z_{1:t}), \tag{4.4}$$

and then take $\mathbf{f}^{(R)*}(t+1)$ as the next robot action. As new measurements 
arrive, compute a new plan by recalculating the MAP of the blended autonomy 
density. By choosing to interpret navigation under the model 

$$p(\mathbf{h}, \mathbf{f}^{(R)}, \mathbf{f} \mid z_{1:t}), \tag{4.5}$$

blended autonomy in complex environments is modeled in a transparent way: 
human commands are statistically weighted against machine intelligence in a 
receding horizon framework. 
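To make the MAP computation concrete, the following toy sketch evaluates the log of the blended autonomy density on a grid and takes the argmax. It is a sketch under heavy assumptions that are ours: a single time step, scalar states, one crowd member, Gaussian stand-ins for the Gaussian process mixtures, a scalar attraction term with a negative exponent, and a hypothetical cooperation function that penalizes the robot and the pedestrian for occupying nearby positions. None of the numerical values come from the text.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Unnormalized log density of a scalar Gaussian."""
    return -0.5 * (x - mean) ** 2 / var

grid = np.linspace(-3.0, 3.0, 61)      # candidate positions, spacing 0.1
h, f_r, f_c = np.meshgrid(grid, grid, grid, indexing="ij")

log_psi_h = -(h - f_r) ** 2 / 0.5      # scalar attraction term, Sigma = 0.5
# Hypothetical cooperation term: close to zero weight when robot and
# pedestrian coincide, close to full weight when they are well separated.
log_psi_f = np.log1p(-0.99 * np.exp(-0.5 * (f_r - f_c) ** 2 / 0.25))

log_post = (
    log_psi_h + log_psi_f
    + log_gauss(h, 1.0, 0.2)           # operator intends position 1.0
    + log_gauss(f_r, 0.0, 0.2)         # robot model predicts position 0.0
    + log_gauss(f_c, 0.5, 0.2)         # pedestrian predicted at 0.5
)
idx = np.unravel_index(np.argmax(log_post), log_post.shape)
h_star, f_r_star, f_c_star = grid[idx[0]], grid[idx[1]], grid[idx[2]]
print(h_star, f_r_star, f_c_star)
```

The robot component of the argmax is pulled toward the operator's intent but kept away from the pedestrian, and the pedestrian's predicted position shifts as well, which is the joint, cooperative behavior the density is meant to encode. Exhaustive grid search is used only because the toy problem is tiny; it does not scale to full trajectories.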

The key insight is that by modeling the joint human-robot system, we can 
blend human and robot capabilities in a single step to produce a superior system 
level decision. When the human system and the robot system are modeled 
independently, it becomes unclear how to fuse the complementary proficiencies 
of the human and robot agents. 